11  Randomness and Complexity
In anticipation of the following sections, we can already note that
incompressibility (i.e., the total absence of regularities) forms a criterion
of randomness. This criterion uses the notion of algorithmic complexity. The
first sequence can be generated by the brief instruction “write ‘1’ 32 times”
and the second by the only marginally longer statement “write ‘01’ 16 times”,
whereas the third, which was generated by blindly tapping on a keyboard, has
no apparent regularity.
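As an illustrative sketch (not from the text), the compressed length under an off-the-shelf compressor such as zlib can serve as a crude, computable stand-in for algorithmic complexity, which is itself uncomputable. The sequence lengths and the use of Python’s pseudorandom generator to simulate “blind typing” are assumptions made here for the demonstration:

```python
import random
import zlib

# Three long binary strings analogous to the sequences in the text:
# a constant run, a strict alternation, and a "blindly typed" one
# (simulated here with a seeded pseudorandom generator -- itself an
# instance of the footnote's caveat about algorithmic "randomness").
uniform = "1" * 10_000
periodic = "01" * 5_000
rng = random.Random(0)
haphazard = "".join(rng.choice("01") for _ in range(10_000))

# Compressed length as a rough proxy for description length: the two
# regular strings should compress far better than the haphazard one.
sizes = {name: len(zlib.compress(s.encode()))
         for name, s in [("uniform", uniform),
                         ("periodic", periodic),
                         ("haphazard", haphazard)]}
print(sizes)
```

The regular strings shrink to a few dozen bytes (the analogue of the short generating instructions), while the haphazard string resists compression.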
“Absence of pattern” corresponds to the dictionary synonym “haphazard” (cf.
the French expression “au hasard”). By counting the number of 1s and 0s in a
long segment of the third sequence, we can obtain an estimate of the
probability of occurrence of each symbol. “Haphazard” then means that the
choice of each successive symbol is made independently, without reference to
the preceding symbol or symbols, in sharp contrast to the second sequence,
which could also be generated by the algorithm “if the preceding symbol is 1,
write 0, otherwise write 1” operating on a starting seed of 1.
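The alternation rule quoted above can be sketched as follows. This is a minimal illustration: the function name `alternate` and the convention that the seed primes the rule without itself being emitted are choices made here, not taken from the text.

```python
def alternate(seed: str, length: int) -> str:
    """Apply the rule "if the preceding symbol is 1, write 0, otherwise
    write 1", starting from a seed that primes the rule but is not emitted."""
    out = []
    prev = seed
    for _ in range(length):
        prev = "0" if prev == "1" else "1"
        out.append(prev)
    return "".join(out)

second = alternate("1", 32)
print(second == "01" * 16)  # True: the rule regenerates the second sequence

# Counting symbols alone cannot distinguish this from a haphazard string --
# both give frequencies near 1/2 -- yet here each symbol depends entirely
# on its predecessor, so the choices are anything but independent.
print(second.count("1") / len(second))  # 0.5
```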
Note how closely this exercise of algorithmic compression is related to the
general aim of science: to find the simplest set of axioms that will enable
all the observable phenomena studied by the branch of science concerned to be
explained (an empirical fact being “explained” if the propositions expressing
it can be shown to be a consequence of the axioms constituting the scientific
theory underpinning that branch). For example, Maxwell’s equations turned out
to be suitable for explaining the phenomena of electromagnetism.¹
The meaning of randomness as denoting independence from what has gone before
is well captured in the familiar expression “random access memory”, the
significance being that a memory location can be selected arbitrarily (cf.
the German “beliebig”, at whim), as opposed to a sequential access memory,
whose elements can only be accessed one after the other. Mention of memory
brings to mind the fact that successive independent choices imply the absence
of memory in the process generating those choices.
The validity of the above is independent of the actual probabilities of
choosing symbols; that is, they may be equal or unequal. Although in many
organisms it turns out that the frequencies of occurrence of all four bases
are in fact equal, this is by no means universal, it being well known that
thermophilic bacteria have more C≡G base pairs than A=T in their genes, since
the former, being linked by three hydrogen bonds, are more thermally stable
than the latter, which only have two (cf. Fig. 15.3). Yet, we can still speak
of randomness in this case. In binary terms, it corresponds to unequal
probabilities of heads or tails, and the sequence may still be algorithmically
¹ An obvious corollary of this association of randomness with algorithmic
compressibility is that there is an intrinsic absurdity in the notion of an
algorithm for generating random numbers, such as those included with many
compilers and other software packages. These computer-generated pseudorandom
numbers generally pass the usual statistical tests for randomness, but little
is known about how their nonrandomness affects results obtained using them.
Quite possibly the best heuristic sources of (pseudo)random digits are the
successive digits of irrational numbers like π or √2. These can be generated
by a deterministic algorithm and, of course, are always the same, but in the
sense that one cannot jump to (say) the hundredth digit without computing
those preceding it, they do fulfil the criteria of haphazardness.
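The footnote’s suggestion can be sketched with Python’s `decimal` module, which computes digits of √2 deterministically to any requested precision; the precision setting, the 1000-digit cut-off, and the crude frequency count are illustrative choices made here, not prescriptions from the text.

```python
from collections import Counter
from decimal import Decimal, getcontext

# Deterministically compute the decimal expansion of sqrt(2); the same
# digits are produced on every run, as the footnote notes.
getcontext().prec = 1010            # a few guard digits beyond what we keep
digits = str(Decimal(2).sqrt())     # "1.41421356..."
fractional = digits.split(".")[1][:1000]  # first 1000 digits after the point

# A crude frequency check: each digit 0-9 should occur roughly 100 times,
# consistent with (though far from proving) haphazardness of the expansion.
counts = Counter(fractional)
print(sorted(counts.items()))
```

Note that this only probes single-digit frequencies; genuine statistical test batteries examine much subtler regularities.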